arXiv:1501.02323v1 [cs.SI] 10 Jan 2015


Context dependent preferential attachment model for complex networks

Pradumn Kumar Pandey and Bibhas Adhikari 


Abstract: In this paper, we propose a growing random complex network model, which we call the context dependent preferential attachment model (CDPAM), in which the preference of a new node to attach to old nodes is determined by local and global properties of the old nodes. We take the local and global properties of a node to be its degree and its relative average degree, respectively. We prove that the degree distribution of complex networks generated by CDPAM follows a power law whose exponent lies in the interval [2, 3], and that the expected diameter grows logarithmically with the number of new nodes added to the initial small network. Numerical results show that the expected diameter stabilizes when the new nodes assign similar weights to the local and global properties. By computing various measures, including clustering coefficient, assortativity, number of triangles, algebraic connectivity, and spectral radius, we show that the proposed model replicates properties of real networks better than the BA model for all these measures when similar weights are given to the local and global properties. Finally, we observe that the BA model is a limiting case of CDPAM when new nodes give a large weight to the local property compared to the weight given to the global property during link formation.

Index Terms: context dependent preferential attachment, degree, relative average degree, clustering coefficient, assortativity, number of triangles, algebraic connectivity, spectral radius, diameter.


I. Introduction 

Modelling complex networks has been an active area of research due to its applications in various fields of science and technology [?]. Several attempts have been made to construct deterministic and random complex network models that capture the spirit of large-scale real-world networks such as social networks [5], biological networks [?], and technological networks [?]. Two prime characteristics observed in a large class of real networks are a power-law degree distribution of the nodes and small-world behavior of the network [?], [9], [10], [11], [12]. The Erdős-Rényi (ER) model [13] is one of the first models for generating random networks: a fixed number of nodes is chosen at the initial stage, and links are then formed by a random procedure. However, it was later observed that the ER model fails to capture essential features of real networks; for example, its degree distribution is not a power law. Consequently, considerable interest has arisen in models that produce networks with power-law degree distributions.

One of the most insightful growing random complex network models was proposed by Barabási et al. in 1999 and is known as the BA model [?]. In this model, a small network is chosen at the start; new nodes then appear and link to the existing nodes with a probability determined by a property (the degree) of the existing nodes [?], [?]. The philosophy is that, at each iteration, a new node prefers to attach to an old node of high degree (among all existing ones), where degree sometimes represents the importance of a node in a social context. Interestingly, the network generated by this model has a power-law degree distribution, and thus the concept of scale-free networks emerged. In their seminal paper [?], Barabási et al. also predicted that growth and preferential attachment are jointly responsible for the emergence of the scale-free property in real networks. It has also been shown that the diameter grows approximately logarithmically with the size of the network.

Does a new node always wish to form links with important (high-degree) nodes, or is the choice also influenced by other factors? Moreover, if the choice is influenced by other properties of the existing nodes, will the network still have a power-law degree distribution? Evidence that people's choices do not depend on only one property is given in [?], supported by empirical data (see also [16], [17], [18]). The data show that, when purchasing a product, a buyer considers the background (history) of the product and its relative attractiveness with respect to other products in the same reference. Thus, the concept of context preferential attachment was introduced in [?].
In this paper, we propose a growing random complex network model in which the probability of link formation is determined by weighted local and global properties of the existing nodes. We take the local and global properties of a node to be its degree and its relative average degree in the network, respectively. Thus, we call the proposed model the context dependent preferential attachment model (CDPAM) for complex networks. We prove that the degree distribution of complex networks generated by CDPAM follows a power law $P(k) = L(k)k^{-\gamma}$, where $2 \le \gamma \le 3$ and $L(k) \to a$ (a constant which depends on the weights given to the local and global properties of the nodes) as $k \to \infty$. We also prove that the expected diameter grows logarithmically with the number of new nodes added to the network, although this growth is slower than that of the BA model. Moreover, our numerical simulations show that the expected diameter stabilizes when similar weights are given to the local and global properties which determine the preference of link formation. In contrast to the conventional wisdom that the diameter grows as a function of $\ln(\ln N)$ or $\ln N$ in real networks, the authors of [?] observed that the diameter stabilizes or shrinks as a network grows. The proposed model reveals how the shrinking and growth of the diameter are related to the weights on the local and global properties of the nodes during expansion of the network.

A variety of mathematical and statistical measures have been proposed in the literature to characterize the global and local structure of complex networks. We compute the clustering coefficient, assortativity, number of triangles, algebraic connectivity, and spectral radius for complex networks generated by CDPAM and compare them with the same measures obtained from networks generated by the BA model. We show that our model replicates properties of real networks better than the BA model for all these measures when similar weights are given to the local and global properties. Finally, we observe that the BA model is a limiting case of CDPAM when new nodes give a large weight to the local property compared to the weight given to the global property during link formation.


II. Context dependent preferential attachment model (CDPAM)


In this section, we propose a random complex network model which relies on the fact that the network is open, i.e., it grows continuously in time through the addition of new nodes to a fixed small network chosen at the beginning of the process [20]. It is important to notice that link formation in the BA model is biased, as it depends only on the degree (importance) of the existing nodes. However, in real life we prefer to form relationships (links) with important (global property) people in society, but we also give importance to the background (local property) of the people before forming the relation. Inspired by this thought, we introduce the model as follows.

1) Growth: Starting with a small network having $m_0$ nodes, at every timestep we add a new node with $m < m_0$ edges, and the new node gets linked to nodes already present in the network.

2) Context preferential attachment: Let $N(t)$ denote the node set of the network after $t$ timesteps. A new node $j$ that appears at time $t+1$ gets connected to node $i \in N(t)$ with probability $p_j^i(t+1)$ given by

$$p_j^i(t+1) = \frac{\beta f_B(i) + \theta g(i, N(t))}{\sum_{l \in N(t)} \left[\beta f_B(l) + \theta g(l, N(t))\right]} \tag{1}$$

where $f_B(i)$ quantifies the background (local context) of node $i$, $g(i, N(t))$ determines the relative advantage (global context) of node $i$ over the other nodes in the network $N(t)$, and $\beta$, $\theta$ ($< \beta$) are positive control parameters for the properties of the nodes in $N(t)$.

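In order to simplify the model, we consider

$$f_B(i) = k_i \quad\text{and}\quad g(i, N(t)) = k_i - \frac{\sum_{l \in N(t)} k_l}{|N(t)|},$$

where $k_i$ denotes the degree of node $i$ and $|N(t)|$ is the number of nodes in $N(t)$. As a single node appears at each timestep, after time $t$ there are $t + m_0$ nodes in the network, and for a large value of $t$ ($\gg m_0$), $|N(t)| \approx t$. Consequently, we have

$$p_j^i(t+1) = \frac{\beta k_i + \theta\left(k_i - \frac{\sum_{l \in N(t)} k_l}{|N(t)|}\right)}{\beta\sum_{l \in N(t)} k_l + \theta\sum_{l \in N(t)}\left(k_l - \frac{\sum_{l' \in N(t)} k_{l'}}{|N(t)|}\right)} = \frac{\beta k_i + \theta\,\frac{(t+m_0)k_i - 2mt - m_0(m_0-1)}{t+m_0}}{\beta\left(2mt + m_0(m_0-1)\right)} \approx \frac{\beta k_i + \theta(k_i - 2m)}{2m\beta t}$$

for a very small value of $m_0$. Assuming $k_i$ to be a continuous real variable and the rate of change of $k_i$ to be proportional to $p_j^i(t)$, we have

$$\frac{dk_i}{dt} = m\,\frac{\beta k_i + \theta(k_i - 2m)}{2m\beta t} \tag{2}$$

by applying mean-field theory.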
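The growth and attachment rules above can be simulated directly. The following Python sketch is one possible reading of the model, not the authors' MATLAB code: the function name `cdpam_graph`, the adjacency-dictionary representation, the redrawing of duplicate targets until $m$ distinct old nodes are found, and the clipping of negative weights at zero are all assumptions made for illustration. It uses $f_B(i) = k_i$ and $g(i, N(t)) = k_i$ minus the current average degree, and starts from a complete graph on $m_0$ nodes.

```python
import random


def cdpam_graph(n_new, m=5, m0=7, beta=1.2, theta=0.5, seed=0):
    """Grow a CDPAM-style network; returns {node: set of neighbours}."""
    rng = random.Random(seed)
    # Initial network: complete graph on m0 nodes.
    adj = {i: set(range(m0)) - {i} for i in range(m0)}
    for new in range(m0, m0 + n_new):
        nodes = list(adj)
        degrees = [len(adj[v]) for v in nodes]
        avg = sum(degrees) / len(nodes)
        # Unnormalised weight beta*k_i + theta*(k_i - average degree).
        # Clipping at zero is a guard added in this sketch (assumption).
        weights = [max(beta * k + theta * (k - avg), 0.0) for k in degrees]
        targets = set()
        while len(targets) < m:  # redraw until m distinct old nodes are chosen
            targets.add(rng.choices(nodes, weights=weights, k=1)[0])
        adj[new] = set(targets)
        for v in targets:
            adj[v].add(new)
    return adj
```

With the defaults, `cdpam_graph(50)` returns a 57-node network with 21 + 50 x 5 = 271 links. Recomputing every weight at each step costs time proportional to the current network size per new node, which is adequate for small experiments but not for very large networks.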

The degree distribution of the network generated by the 
CDPAM is provided in the following theorem. 

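Theorem 2.1: The degree distribution of a complex network generated by CDPAM as described above exhibits a power law in its tail, given by $P(k) = L(k)k^{-\gamma}$, where $L(k) \to (\gamma - 1)(m - c)^{\gamma - 1}$ as $k \to \infty$, $\gamma = 1 + \frac{2\beta}{\beta + \theta}$, and $c$ is a constant depending on $m$, $\beta$ and $\theta$.

In particular, $\gamma \approx 2$ if $\beta \approx \theta$ and $\gamma \approx 3$ if $\beta \gg \theta$.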

Proof: From (2) we have

$$\frac{dk_i}{dt} = m\,\frac{\beta k_i + \theta(k_i - 2m)}{2m\beta t} = \frac{k_i - c}{(\gamma - 1)t},$$

solving which we obtain

$$k_i(t) = (m - c)\left(\frac{t}{t_i}\right)^{1/(\gamma - 1)} + c \tag{3}$$

when the initial condition is given by $k_i(t_i) = m$. This yields

$$P(k_i(t) < k) = P\left(t_i > (m - c)^{\gamma - 1}(k - c)^{1 - \gamma}\,t\right),$$

since $k_i(t) < k$ holds exactly when $t_i > (m - c)^{\gamma - 1}(k - c)^{1 - \gamma}\,t$. Further, since a single node is added at each timestep, the $t_i$ are uniformly distributed, with $P(t_i) = 1/(m_0 + t)$. Consequently,

$$P(k_i(t) < k) = P\left(t_i > (m - c)^{\gamma - 1}(k - c)^{1 - \gamma}\,t\right) = 1 - \frac{t}{t + m_0}(m - c)^{\gamma - 1}(k - c)^{1 - \gamma}.$$

The degree distribution is obtained by

$$P(k) = \frac{dP(k_i(t) < k)}{dk} = \frac{t}{t + m_0}(\gamma - 1)(m - c)^{\gamma - 1}(k - c)^{-\gamma}.$$

The desired result for the degree distribution follows by letting $t \to \infty$.
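The integration step that leads from the mean-field equation to (3) can be spelled out by separation of variables. The short derivation below is a sketch that assumes $m > c$, so that the logarithms are defined, and uses only the initial condition $k_i(t_i) = m$ stated in the proof.

```latex
\begin{align*}
\frac{dk_i}{k_i - c} &= \frac{1}{\gamma - 1}\,\frac{dt}{t}
  && \text{(separate variables)} \\
\ln\bigl(k_i(t) - c\bigr) - \ln(m - c) &= \frac{1}{\gamma - 1}\,\ln\frac{t}{t_i}
  && \text{(integrate from $t_i$ to $t$, with $k_i(t_i) = m$)} \\
k_i(t) &= (m - c)\left(\frac{t}{t_i}\right)^{1/(\gamma - 1)} + c
  && \text{(exponentiate and rearrange)}
\end{align*}
```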

Taking the initial network to be the complete network on 7 nodes, i.e., $m_0 = 7$, and setting $m = 5$, we plot the degree distributions of complex networks generated by CDPAM for different values of $\beta$ and $\gamma$ in Fig. 1. We also calculate the p-value, a measure of goodness of fit based on the KS statistic, to validate the power-law degree distribution of the networks [19]. The numerical simulations show that the exponent $\gamma$ is an increasing function of $\beta$ when $\theta$ is fixed.

Fig. 1: Degree distribution. Panels: (a) $\beta = 0.6$, (b) $\beta = 1.2$, (c) $\beta = 1.8$, (d) $\beta = 2.4$, (e) $\beta = 3.0$, (f) $\beta = 6$, (g) $\beta = 60$, (h) $\beta = 300$, (i) $\beta = 600$, (j) $\beta = 600000$; $\theta = 0.5$ in all panels.

$\beta$      $\gamma$ (calculated numerically)   p-value   $\gamma$ (theoretical)
0.6      1.94                            0.170     2.090
1.2      2.50                            0.090     2.411
1.8      2.62                            0.220     2.565
2.4      2.70                            0.490     2.655
3.0      2.78                            0.135     2.714
6        2.84                            0.025     2.846
60       2.82                            0.600     2.980
300      2.82                            0.996     2.996
600      2.82                            0.290     2.998
600000   2.81                            0.017     2.999

TABLE I: Network parameters
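The theoretical exponents can be recomputed from the mean-field equation: the coefficient of $k_i$ in $dk_i/dt$ is $(\beta + \theta)/(2\beta t)$, and matching it to $1/((\gamma - 1)t)$ gives $\gamma = 1 + 2\beta/(\beta + \theta)$. The Python sketch below evaluates this expression for the values of $\beta$ listed in Table I with $\theta = 0.5$; the function name `gamma_theoretical` is a hypothetical helper introduced here, and the printed values should land close to the theoretical column of the table.

```python
def gamma_theoretical(beta, theta=0.5):
    """Mean-field power-law exponent gamma = 1 + 2*beta / (beta + theta)."""
    return 1.0 + 2.0 * beta / (beta + theta)


# Values of beta used in Table I (theta is fixed at 0.5).
betas = [0.6, 1.2, 1.8, 2.4, 3.0, 6, 60, 300, 600, 600000]
for b in betas:
    print(b, round(gamma_theoretical(b), 3))
```

For any $0 < \theta < \beta$ the expression stays strictly between 2 and 3 and increases with $\beta$, which matches the trend reported for the numerical exponents.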


In order to show that the diameter of a complex network constructed by CDPAM is small, we proceed as follows. Let the nodes $i$ and $j$ appear in the network at times $t_i$ and $t_j$, respectively, and assume that $t_i < t_j$. Then the probability that node $j$ links to node $i$ is given by

$$p_j^i = m\,\frac{\beta k_i(t_j) + \theta(k_i(t_j) - 2m)}{2m\beta t_j},$$

where $k_i(t_j) = (m - c)\left(\frac{t_j}{t_i}\right)^{1/(\gamma - 1)} + c$ (see (3)) is the degree of the node $i$ at time $t_j$. Thus,

$$p_j^i = \frac{m - c}{(\gamma - 1)\,t_i^{1/(\gamma - 1)}\,t_j^{1 - 1/(\gamma - 1)}} + \frac{m(1 - 2\theta)}{2\beta t_j}. \tag{4}$$


Remark 2.2: It is evident from the above derivation that the control parameters $\beta$ and $\theta$, which represent the weights on the local and global properties of the existing nodes respectively, determine the topology of the network generated by CDPAM. A natural question would be: does there exist a functional relation between these parameters? To investigate how different values of these parameters affect the topology of the network, we fix the parameter $\theta$ and vary $\beta$ in the sequel. Thus, from now on we set $\theta = 0.5$.

We recall the following lemma from [?].

Lemma 2.3: If $A_1, A_2, \ldots, A_n$ are mutually independent events and their probabilities fulfil the relations $P(A_i) \le \epsilon$ for all $i$, then

$$P\left(\bigcup_{i=1}^{n} A_i\right) = 1 - \exp\left(-\sum_{i=1}^{n} P(A_i)\right) - Q,$$

where $0 \le Q < \sum_{j=0}^{n+1}\frac{(n\epsilon)^j}{j!} - (1 + \epsilon)^n$.
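A quick numerical illustration of why the exponential form is useful: for independent events the union probability is $1 - \prod_i (1 - P(A_i))$, and when every $P(A_i)$ is small this is close to $1 - \exp(-\sum_i P(A_i))$. The Python sketch below compares the two and evaluates the bound stated in the lemma; the function names and the choice of 200 events with probability 0.001 each are illustrative assumptions, not values from the paper.

```python
import math


def union_exact(probs):
    """P(A_1 or ... or A_n) for mutually independent events."""
    none = 1.0
    for p in probs:
        none *= 1.0 - p
    return 1.0 - none


def union_exponential(probs):
    """Leading term of the lemma: 1 - exp(-sum of P(A_i))."""
    return 1.0 - math.exp(-sum(probs))


def lemma_bound(n, eps):
    """sum_{j=0}^{n+1} (n*eps)^j / j!  -  (1 + eps)^n, built term by term."""
    total, term = 0.0, 1.0
    for j in range(n + 2):
        if j > 0:
            term *= n * eps / j  # (n*eps)^j / j! from the previous term
        total += term
    return total - (1.0 + eps) ** n


probs = [0.001] * 200
gap = abs(union_exact(probs) - union_exponential(probs))
print(union_exact(probs), union_exponential(probs), gap, lemma_bound(200, 0.001))
```

Building the series term by term avoids forming very large factorials explicitly.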

Assume that $N(t)$ denotes the set of all nodes which have been added to the network up to timestep $t$. In the network generated by CDPAM, assume that the nodes $i, j \in N(t)$ are connected by a path $(i, v_1, v_2, \ldots, v_{l-1}, j)$ of length $l$, where $v_k \in N(t)$ for all $k = 1, \ldots, l - 1$. Consider this sequence as a single event $A_k$. The total number of such possible events is $|N(t)|^{l-1}$. Thus, as given in [?], the probability of the existence of a path between $i$ and $j$ of length not more than $l$ is given by

$$p_{ij}(l) = P\left(\bigcup_{k=1}^{|N(t)|^{l-1}} A_k\right) = 1 - \exp\left[-\sum_{v_1 = 1}^{|N(t)|} \cdots \sum_{v_{l-1} = 1}^{|N(t)|} p^{i}_{v_1}\,p^{v_1}_{v_2} \cdots p^{v_{l-1}}_{j}\right]. \tag{5}$$


We use this result to obtain the following corollary. 
Corollary 2.4: The probability of the existence of a path between two vertices $i, j \in N(t)$ of length not more than $l$ is given by

$$p_{ij}(l) = 1 - \exp\left[-\frac{K^{l} H_n^{l-1}}{t_i^{1/(\gamma - 1)}\,t_j^{1 - 1/(\gamma - 1)}}\right],$$

where $K = \frac{m - c}{\gamma - 1}$, $H_n = \sum_{k=1}^{n} \frac{1}{k}$ with $n = |N(t)|$, and $c$ is given in Theorem 2.1.

Proof: Using (4) and (5) the result follows.

Corollary 2.5: The expected value $l_{ij}$ of the distance between two nodes $i, j \in N(t)$ is given by

$$l_{ij} = \frac{\frac{1}{\gamma - 1}\ln t_i + \left(1 - \frac{1}{\gamma - 1}\right)\ln t_j - \ln K - \Gamma}{\ln(K H_n)} + \frac{1}{2}.$$

Proof: The result follows from the fact that

$$l_{ij} = \sum_{l=0}^{\infty} F(l),$$

where $F(l) = 1 - p_{ij}(l)$ (see [21]).

Observe in Corollary 2.5 that the expected distance $l_{ij}$ between two nodes $i, j \in N(t)$ is an increasing function of $t_i$ and $t_j$ when the other parameters are fixed. This implies that the diameter of the network is the expected distance between the first node and the last node added to the network. Hence, setting $t_j = |N(t)|$ and $t_i = 1$ we obtain the following result.

Corollary 2.6: The expected diameter of a complex network generated by CDPAM is given by

$$D = \frac{\left(1 - \frac{1}{\gamma - 1}\right)\ln|N(t)| - \ln K - \Gamma}{\ln(K H_n)} + \frac{1}{2}.$$

Thus it follows from the above corollary that the expected diameter of the network depends logarithmically on the number of new nodes added to the network. In Fig. 2 we plot the expected diameter for CDPAM together with the approximate diameter $\ln N / \ln\ln N$ given by the BA model [22]. However, numerical simulations show that the expected diameter of CDPAM stabilizes when similar weights are assigned to the local and global properties which determine the preference of link formation. In contrast to the conventional wisdom that the diameter is a function of $\ln(\ln N)$ or $\ln N$ in real networks, the authors of [?] observed that the diameter stabilizes or shrinks as a network grows. CDPAM reveals how the shrinking and growth of the diameter are related to the weights on the local and global properties of the nodes during expansion of the network.



Fig. 2: Diameter growth of networks. The horizontal axis represents $\ln N$ and the vertical axis represents $D$. Curves: beta=0.6, beta=0.7, beta=1, beta=100, and the BA model.
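For a single generated network, the diameter can be measured exactly by running a breadth-first search from every node. The Python sketch below shows this under stated assumptions: the graph is an adjacency dictionary `{node: set of neighbours}`, it is connected, and the helper names `eccentricity` and `diameter` are introduced here for illustration. It gives the diameter of one realization, so an estimate of the expected diameter would require averaging over several independently generated networks.

```python
from collections import deque


def eccentricity(adj, source):
    """Return (largest BFS distance from source, number of nodes reached)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values()), len(dist)


def diameter(adj):
    """Exact diameter of a connected undirected graph."""
    best = 0
    for s in adj:
        ecc, reached = eccentricity(adj, s)
        if reached != len(adj):
            raise ValueError("graph is not connected")
        best = max(best, ecc)
    return best


# Small example: a path on four nodes.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(diameter(path))
```

The cost is one search per node, which is fine for networks of a few thousand nodes; larger networks would usually call for sampling the source nodes.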


III. Property of complex networks generated by CDPAM

In this section, we numerically calculate various measures, including clustering coefficient, assortativity, algebraic connectivity, and spectral radius, for the complex networks generated by CDPAM. These measures determine various topological features of a network and enable us to compare how well the proposed model captures the properties of different real networks. We also compare the values of these measures with those of complex networks generated by the BA model. We used MATLAB R2012a for the numerical simulations.

A. Clustering coefficient 

The clustering coefficient (CC) of a node signifies the local edge density among the neighbors of the node. The CC of a network is the average of the CC of all its nodes. Thus, for a network $N$,

$$CC(i) = \frac{|E_i|}{\binom{k_i}{2}} \quad\text{and}\quad CC(N) = \frac{1}{|N|}\sum_{i \in N} CC(i),$$

where $|E_i|$ denotes the number of links among the neighbors of a node $i$ of the network $N$. It is evident that $0 \le CC(N) \le 1$ for any network $N$. In Fig. 3 we plot the CC of complex networks of different sizes generated by CDPAM with different values of $\beta$ and $\theta = 0.5$. It shows that as the value of $\beta$ increases the CC of the network decreases, and eventually, when $\beta$ is very large, the CC is close to that of the network generated by the BA model. Fig. 4 shows that the CC gets close to 0.8 as $\log\beta$ gets close to zero. Thus we conclude that, in CDPAM, if links are formed by giving equal weights to the local and global properties of the existing nodes, then the CC gets close to 0.8, which is a property of a large class of real networks such as the ego-Facebook, ego-Gplus and ego-Twitter networks [5].
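The definition of the clustering coefficient translates into a few lines of code. The Python sketch below is an illustration rather than the authors' MATLAB routine: it assumes an adjacency dictionary `{node: set of neighbours}`, the names `local_cc` and `network_cc` are hypothetical, and assigning a CC of zero to nodes of degree less than two is a convention chosen here.

```python
from itertools import combinations


def local_cc(adj, i):
    """Fraction of pairs of neighbours of i that are themselves linked."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention for degree < 2 (assumption of this sketch)
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def network_cc(adj):
    """Average of the local clustering coefficients over all nodes."""
    return sum(local_cc(adj, i) for i in adj) / len(adj)


# Triangle 0-1-2 with a pendant node 3 attached to node 0.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(network_cc(g))
```

In the example, node 0 has three neighbours with one link among them, nodes 1 and 2 have fully linked neighbourhoods, and node 3 contributes zero, so the network value is the mean of 1/3, 1, 1 and 0.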


B. Assortativity index 

The assortativity index (AI) of a network $N$ is defined by

$$AI(N) = \frac{\sum_{ij}\left(A_{ij} - \frac{k_i k_j}{2|E|}\right)k_i k_j}{\sum_{ij}\left(k_i\delta_{ij} - \frac{k_i k_j}{2|E|}\right)k_i k_j},$$

where $A_{ij}$ is the $ij$-th entry of the adjacency matrix associated with $N$, $|E|$ is the number of links of $N$, and $\delta_{ij}$ is the Kronecker delta function [23]. Obviously $-1 \le AI(N) \le 1$. A positive value of $AI(N)$ signifies that nodes of similar degree tend to be linked, whereas a negative value of $AI(N)$ implies that nodes of similar degree tend not to be linked.

Fig. 3: The horizontal axis represents the size of the network and the vertical axis represents the clustering coefficient.

Fig. 4: The horizontal axis represents log(beta) and the vertical axis represents the clustering coefficient of the network.

Fig. 5: The horizontal axis represents the size of the network and the vertical axis represents the assortativity of the network.

Fig. 6: The horizontal axis represents the size of the network and the vertical axis represents the triangle count of the network.

Consider the network $N$ after the addition of $t$ nodes to the given small network. Then by (3), the degree of a node is a decreasing function of the timestep of its appearance. Further, the probability $p_j^i$ that a node $j$ which appeared at timestep $t_j$ links to an existing node $i$ which appeared at $t_i < t_j$ is a decreasing function of both $t_i$ and $t_j$; see (4). These observations indicate that the probability of a link between high-degree nodes is larger than the probability of a link between low-degree nodes. Therefore, we conclude that the network is assortative for high-degree nodes and disassortative for low-degree nodes. Since the network has only a few high-degree nodes, the network is disassortative overall. The plots in Fig. 5 confirm this for different values of $\beta$ and $\theta = 0.5$. We mention here that disassortativity occurs in a large class of real networks, including the World Wide Web [11], a marine food web [24], and a freshwater food web [25].
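The assortativity index can be evaluated without forming the adjacency matrix, because the double sums reduce to sums over links and over nodes: $\sum_{ij} A_{ij}k_i k_j$ runs over both orientations of every link, $\sum_{ij} k_i^2 k_j^2 = (\sum_i k_i^2)^2$, and $\sum_{ij} k_i\delta_{ij}k_i k_j = \sum_i k_i^3$. The Python sketch below applies these reductions to the standard degree-correlation formula; the function name `assortativity_index` and the adjacency-dictionary input are assumptions made for illustration.

```python
def assortativity_index(adj):
    """Degree assortativity of an undirected graph {node: set of neighbours}."""
    deg = {v: len(adj[v]) for v in adj}
    two_e = sum(deg.values())  # twice the number of links
    # sum_ij A_ij k_i k_j: every link is visited in both directions.
    s_links = sum(deg[u] * deg[v] for u in adj for v in adj[u])
    s2 = sum(k * k for k in deg.values())
    s3 = sum(k ** 3 for k in deg.values())
    common = s2 * s2 / two_e  # sum_ij (k_i k_j / 2|E|) k_i k_j
    denom = s3 - common
    if abs(denom) < 1e-12:
        raise ValueError("index undefined: all nodes have the same degree")
    return (s_links - common) / denom


# A star with three leaves: the hub links only to degree-1 nodes.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(assortativity_index(star))
```

For regular graphs the denominator vanishes, so the sketch raises an error instead of returning a value.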

C. Number of triangles 

A triangle is a cycle with three nodes. Triangles are a fundamental building block of many real networks. In a social network, if nodes are human beings and links are described by a friendship relation, then a triangle means that friends of a friend are friends. Real networks often contain a huge number of triangles, which can be both homogeneous and heterogeneous [26]. In Fig. 6 we show that complex networks generated by CDPAM contain a huge number of triangles compared to a network constructed by the BA model, as do real networks such as the ego-Facebook, ego-Gplus and ego-Twitter networks [5].
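Triangles can be counted by intersecting the neighbourhoods of the two end nodes of every link. The Python sketch below is a minimal illustration, assuming an adjacency dictionary `{node: set of neighbours}` whose node labels can be ordered (integers, for instance); the name `count_triangles` is hypothetical. Each triangle with labels $a < b < c$ is counted once, when the link $(a, b)$ is visited and $c$ is found among the common neighbours.

```python
def count_triangles(adj):
    """Number of triangles in an undirected graph {node: set of neighbours}."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Common neighbours with a larger label close a new triangle.
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


# Complete graph on four nodes: every triple of nodes forms a triangle.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
print(count_triangles(k4))
```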

D. Algebraic connectivity 

The algebraic connectivity of a network $N$ is the second smallest eigenvalue of the Laplacian matrix $L(N) = D(N) - A(N)$ associated with the network, where $D(N) = \mathrm{diag}\{k_1, \ldots, k_n\}$ denotes the degree matrix and $A(N)$ is the adjacency matrix of the network [27]. Obviously, $L(N)$ is a symmetric positive semi-definite matrix. It is well known that the second smallest eigenvalue $\lambda_2$ of $L(N)$ is positive if and only if $N$ is connected. More importantly, $\lambda_2$ determines the robustness of a network: the larger the value of $\lambda_2$, the more difficult it is to disconnect the network by removing nodes or edges [27]. In particular, if $\mu(N)$ and $\eta(N)$ denote the vertex and edge connectivity of a network $N$, respectively, then $\lambda_2 \le \mu(N) \le \eta(N)$. We show in Fig. 7 that if a complex network is produced by CDPAM after setting $\beta \approx \theta$, that is, giving almost equal weight to the local and global properties of the existing nodes, then the network has higher algebraic connectivity than a network produced by the BA model.
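One dependency-free way to estimate $\lambda_2$ is power iteration on a shifted Laplacian. The Python sketch below is such an estimate under stated assumptions, not the method used for the figures: it takes an adjacency dictionary with at least two nodes, uses the shift $s = 2k_{\max} + 1$ (which exceeds every Laplacian eigenvalue), and repeatedly projects out the all-ones vector so that the iteration converges to the largest eigenvalue of $sI - L$ on the remaining subspace, namely $s - \lambda_2$. The function name, the iteration count, and the random starting vector are illustrative choices.

```python
import math
import random


def algebraic_connectivity(adj, iters=2000, seed=0):
    """Estimate the second smallest Laplacian eigenvalue by power iteration."""
    nodes = list(adj)
    n = len(nodes)
    # The largest Laplacian eigenvalue is at most 2 * max degree.
    shift = 2.0 * max(len(adj[v]) for v in nodes) + 1.0
    rng = random.Random(seed)
    x = {v: rng.random() for v in nodes}
    lam = 0.0
    for _ in range(iters):
        mean = sum(x.values()) / n
        x = {v: x[v] - mean for v in nodes}  # remove the all-ones component
        norm = math.sqrt(sum(val * val for val in x.values()))
        x = {v: val / norm for v, val in x.items()}
        # y = (shift*I - L) x, with (L x)_v = k_v x_v - sum of x over neighbours
        y = {v: shift * x[v] - (len(adj[v]) * x[v] - sum(x[u] for u in adj[v]))
             for v in nodes}
        lam = sum(x[v] * y[v] for v in nodes)  # Rayleigh quotient
        x = y
    return shift - lam


cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(algebraic_connectivity(cycle4))
```

Convergence slows when $\lambda_2$ and $\lambda_3$ are close, so for large networks a sparse eigenvalue solver would be the more practical choice.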

E. Spectral radius 

The spectral radius of a network is the maximum modulus of the eigenvalues of the adjacency matrix of the network. In [28] it has been shown that the reciprocal of the spectral radius determines the threshold of virus propagation in the network. The smaller the spectral radius, the greater the robustness of a network against the spread of viruses [28].

In Fig. 8 we plot the spectral radius of networks generated by CDPAM and compare it with that of the BA model. Real-world networks show a considerably larger spectral radius than the BA model. CDPAM is able to reproduce the large spectral radius of many real-world networks, including the Dutch soccer team network [28], the Dutch roadmap network [29], and the Internet graph at the IP level [30] and at the Autonomous System level [?].

Fig. 7: The horizontal axis represents the size of the network and the vertical axis represents the algebraic connectivity of the network.

Fig. 8: The horizontal axis represents the size of the network and the vertical axis represents the spectral radius of the network.
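For an undirected network the spectral radius equals the largest adjacency eigenvalue, which power iteration can estimate without any linear algebra library. The Python sketch below is an illustration under stated assumptions: the graph is an adjacency dictionary, the iteration is applied to $A + I$ instead of $A$ so that it does not oscillate on bipartite graphs, and the function name and iteration count are arbitrary choices made here.

```python
import math


def spectral_radius(adj, iters=1000):
    """Estimate the largest adjacency eigenvalue of an undirected graph."""
    nodes = list(adj)
    x = {v: 1.0 for v in nodes}  # positive start overlaps the leading eigenvector
    lam = 1.0
    for _ in range(iters):
        norm = math.sqrt(sum(val * val for val in x.values()))
        x = {v: val / norm for v, val in x.items()}
        # y = (A + I) x
        y = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        lam = sum(x[v] * y[v] for v in nodes)  # Rayleigh quotient of A + I
        x = y
    return lam - 1.0


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(spectral_radius(star))
```

For the star in the example the result should be close to $\sqrt{3}$, the known largest eigenvalue of a star with three leaves.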

IV. Conclusion 

In the literature on social choice theory and management science, it has been established that the choice of a person is influenced by the offered set and, ultimately, that the choice is determined by the local and global contexts of the items in the offered set. Inspired by this concept, we introduced a preferential attachment model for generating growing complex networks in which the preference of a new node to link to old nodes is determined by local and global properties of the old nodes. We call it the context dependent preferential attachment model (CDPAM); the local property is given by the degree of a node, and the global property by the relative average degree of the old nodes. We proved that complex networks generated by CDPAM have a power-law degree distribution and that the expected diameter depends logarithmically on the number of new nodes added to the network. In contrast to the general intuition that the diameter grows with the addition of new nodes, we showed numerically that, in CDPAM, the expected diameter stabilizes when the new nodes link by giving similar importance (weight) to the local and global properties of the old nodes.

In order to investigate how the complex networks generated by the CDPAM and BA models are related, we calculated the clustering coefficient, assortativity, number of triangles, algebraic connectivity, and spectral radius for both models. We compared these measures and concluded that the BA model is a limiting case of CDPAM when new nodes give a large weight to the local property compared to the weight given to the global property during link formation. Using these measures, we showed that CDPAM captures the properties of real networks better than the BA model.

An interesting question is: can communities emerge in CDPAM? We believe that communities will also emerge when the weights on the local and global properties are not constant for all new nodes but vary from node to node. We plan to investigate this phenomenon in the future.